Abstract. We seek to perform efficient queries for the predecessor among
n values stored in k sorted arrays. Evading the Ω(n log k) lower bound
from merging k arrays, we support predecessor queries in O(log n) time
after O(n log(k/log n)) construction time. By applying Ben-Or’s technique,
we establish that this is optimal for strict predecessor queries, i.e.,
every data structure supporting O(log n)-time strict predecessor queries
requires Ω(n log(k/log n)) construction time. Our approach generalizes as a
template for deriving similar lower bounds on the construction time of
data structures with some desired query time.


1 Introduction 

We are given k sorted arrays A_1, A_2, ..., A_k storing n values in total. Let A
be the sorted array that results from merging A_1, A_2, ..., A_k. We would like to
support efficient queries for the predecessor of any query value q in the array A,
i.e., for the largest value in A that is smaller than or equal to q. However, we
would like to accomplish this goal without explicitly constructing A, thereby
avoiding the lower bound of Ω(n log k) from merging k sorted arrays.

By combining partial merging with fractional cascading [2], we support
O(log n)-time predecessor queries in A after O(n log(k/log n)) construction. As our
main contribution, we prove that the resulting data structure is, in fact, optimal
when considering strict predecessor queries, i.e., queries for the largest entry of A
that is strictly smaller than the query value q. By applying Ben-Or’s technique [1],
we establish a lower bound of Ω(n log(k/log n)) on the construction time of every
data structure that supports strict predecessor queries in O(log n) time.
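
To make the two query types concrete, the following minimal Python sketch
answers both without any preprocessing by binary searching each array
separately. It is only a baseline and not the data structure discussed here:
a query costs O(k log n) time. The function names predecessor and
strict_predecessor are illustrative choices of ours.

```python
from bisect import bisect_left, bisect_right

def predecessor(arrays, q):
    """Largest value <= q over all sorted arrays, or None if there is none."""
    best = None
    for a in arrays:
        i = bisect_right(a, q)  # a[:i] holds exactly the entries <= q
        if i > 0 and (best is None or a[i - 1] > best):
            best = a[i - 1]
    return best

def strict_predecessor(arrays, q):
    """Largest value < q over all sorted arrays, or None if there is none."""
    best = None
    for a in arrays:
        i = bisect_left(a, q)  # a[:i] holds exactly the entries < q
        if i > 0 and (best is None or a[i - 1] > best):
            best = a[i - 1]
    return best

arrays = [[1, 8, 20], [3, 5, 13], [2, 13, 21]]
print(predecessor(arrays, 13))         # 13
print(strict_predecessor(arrays, 13))  # 8
```

The two functions differ only in the choice between bisect_right and
bisect_left, which is exactly the difference between "smaller than or equal
to q" and "strictly smaller than q".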

We are interested in lower bounds on the construction time of data structures 
for predecessor search in multiple arrays, because we wish to derive lower bounds 
on the construction time of more complex data structures. 


2 Related Work 

While lower bounds for predecessor search have been extensively studied in various 
models of computation, such as the cell probe model [6], we are unaware of any 
results regarding the version studied in this work. One variant of predecessor 
search that comes close is the setting of fractional cascading, where we seek to
identify the predecessor of a query value in each array as opposed to the overall
predecessor. Chazelle and Guibas [2] support simultaneous predecessor queries in
k sorted arrays in O(k + log n) time after O(n) construction. When k = O(log n),
fractional cascading solves our version of predecessor search optimally.

Ben-Or’s technique [1] works as follows: we formulate a problem as a question
“x ∈ W?” for some set W ⊆ ℝ^d and then bound the height of any algebraic
computation tree deciding this membership question by bounding the number 
of connected components of W. Ben-Or [1] improved known lower bounds, 
e.g., for the knapsack problem [3], and established new ones for a variety of 
problems including element distinctness and geometric constructions with ruler 
and compass. Sacristan [5] summarizes the results related to Ben-Or’s technique. 
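
As a standard illustration of the technique, consider element distinctness on
n real inputs. The following LaTeX sketch is our own summary and not part of
the argument developed later; constants and the base of the logarithm are left
unspecified, and d denotes the dimension of the ambient space.

```latex
% Ben-Or's bound for a set W \subseteq \mathbb{R}^d with \#W connected components:
\[
  \text{height of any algebraic computation tree deciding } x \in W
  \;=\; \Omega\bigl(\log \#W - d\bigr).
\]
% Element distinctness: W = \{ x \in \mathbb{R}^n : x_i \neq x_j \text{ for all } i \neq j \}.
% Each ordering of the coordinates gives a separate component, so \#W = n! and
\[
  \Omega\bigl(\log n! - n\bigr) \;=\; \Omega(n \log n).
\]
```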


3 An Upper Bound 

We divide the sorted arrays A_1, A_2, ..., A_k into groups of size s. Then, we merge
the s arrays in the j-th group into one sorted array B_j, e.g., by maintaining a
min-heap of size s storing for each array the smallest entry that has yet to be
inserted into B_j. Finally, we apply fractional cascading [2] on B_1, B_2, ..., B_⌈k/s⌉.

This construction takes O(n log s) time and occupies O(n) space. We answer a
predecessor query for q in O(k/s + log n) time by determining the predecessors
p_1, ..., p_⌈k/s⌉ of q in B_1, B_2, ..., B_⌈k/s⌉, respectively; the largest of these
values is the predecessor of q in A. When k = ω(log n), we obtain a query time of
O(log n) and a construction time of O(n log(k/log n)) by choosing s = Θ(k/log n).
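
The partial merging step can be sketched in Python as follows. This is a
simplified illustration under stated assumptions, not the full construction:
the function names build and predecessor are ours, and a separate binary
search in each merged array stands in for fractional cascading, so a query in
this sketch costs O((k/s) log n) rather than O(k/s + log n).

```python
from bisect import bisect_right
from heapq import merge

def build(arrays, s):
    """Merge each group of s consecutive sorted arrays into one sorted array B_j."""
    # heapq.merge keeps a heap with one entry per input array, so merging a
    # group of s arrays costs O(log s) per element, O(n log s) overall.
    return [list(merge(*arrays[j:j + s])) for j in range(0, len(arrays), s)]

def predecessor(groups, q):
    """Largest value <= q, or None. Binary search per B_j replaces
    fractional cascading here (an assumption of this sketch)."""
    best = None
    for b in groups:
        i = bisect_right(b, q)  # b[:i] holds exactly the entries <= q
        if i > 0 and (best is None or b[i - 1] > best):
            best = b[i - 1]
    return best

arrays = [[1, 8, 20], [3, 5, 13], [2, 13, 21], [4, 9], [7, 30]]
groups = build(arrays, s=2)
print(groups)                   # [[1, 3, 5, 8, 13, 20], [2, 4, 9, 13, 21], [7, 30]]
print(predecessor(groups, 12))  # 9
```

Larger groups make the construction more expensive and the query cheaper; the
choice of s balances the two.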


4 A Lower Bound 


Our general approach is as follows. Let T(n, m) be the total time required for
answering a sequence of m queries, including all preprocessing. Assume we have a
data structure with construction time C(n) supporting queries in Q(n) time.
Answering m = ⌊n/Q(n)⌋ queries takes T(n, m) ≤ m Q(n) + C(n) ≤ n + C(n) time.
Therefore, any lower bound of T(n, ⌊n/Q(n)⌋) = Ω(X), with X = ω(n), implies a
lower bound of C(n) = Ω(X). We shall use Ben-Or’s technique to find a suitable X.
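
Spelled out step by step, the reduction reads as follows. This is a sketch
that takes the number of queries to be m = ⌊n/Q(n)⌋ and uses only the
quantities T, C, Q, and X introduced above.

```latex
\begin{align*}
  T(n, m) &\le C(n) + m \cdot Q(n)
           && \text{build once, then answer } m \text{ queries} \\
          &\le C(n) + n
           && \text{since } m = \lfloor n / Q(n) \rfloor \\
  \Longrightarrow\quad C(n) &\ge T(n, m) - n = \Omega(X) - n = \Omega(X)
           && \text{since } X = \omega(n).
\end{align*}
```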

Consider the following batch verification variant of strict predecessor search:
We are given k sorted arrays A_1, A_2, ..., A_k of lengths n_1, n_2, ..., n_k, respectively,
and we are given m query points q_1, q_2, ..., q_m ∈ ℝ along with m supposed
answers p_1, p_2, ..., p_m ∈ ℝ. We would like to check whether p_i is indeed the
strict predecessor of q_i among all values in A_1, A_2, ..., A_k for all i = 1, 2, ..., m.
This batch verification problem corresponds to the membership problem for
the set W_m ⊆ ℝ^{n+2m} of all points that encode k sorted arrays, m queries, and
m answers such that every p_i is the strict predecessor of q_i.

According to Ben-Or’s theorem, deciding the membership problem W_m for
the batch verification problem takes Ω(log #W_m − d) time, where #W_m is the
number of components of W_m and d = n + 2m is the dimension of W_m. As the
next step, we establish a lower bound on #W_m by identifying a certain number
of points that belong to pairwise distinct connected components of W_m.
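
The membership question can be made concrete with a brute-force test. The
following Python sketch only pins down which instances belong to the set; it
merges all arrays explicitly and says nothing about how fast membership can be
decided. The function name in_W is hypothetical, and two details are our
assumptions: arrays may contain repeated values, and an instance with a query
that has no strict predecessor is treated as a non-member.

```python
from bisect import bisect_left

def in_W(arrays, queries, answers):
    """Return True iff every array is sorted and answers[i] is the strict
    predecessor of queries[i] among all array entries."""
    if len(queries) != len(answers):
        return False
    # Every array must be sorted (non-decreasing order is assumed here).
    if any(a[i] > a[i + 1] for a in arrays for i in range(len(a) - 1)):
        return False
    merged = sorted(x for a in arrays for x in a)  # explicit merge, checker only
    for q, p in zip(queries, answers):
        i = bisect_left(merged, q)  # merged[:i] holds exactly the entries < q
        if i == 0 or merged[i - 1] != p:
            return False
    return True

arrays = [[1, 8], [3, 5]]
print(in_W(arrays, [4, 9], [3, 8]))  # True
print(in_W(arrays, [4, 9], [3, 5]))  # False: the strict predecessor of 9 is 8
print(in_W(arrays, [5], [5]))        # False: 5 is not strictly smaller than 5
```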

To study the structure of W_m, we start with some instance x ∈ W_m of batch
verification, i.e., a point x ∈ ℝ^{n+2m} encoding k sorted arrays A_1, A_2, ..., A_k,
queries q_1, q_2, ..., q_m, and answers p_1, p_2, ..., p_m such that p_i is the strict
predecessor of q_i among all array entries for i = 1, 2, ..., m. We can continuously
move some of the entries of x without leaving W_m. For instance, we remain in W_m
when moving a query without changing its strict predecessor. Other changes, like
moving the supposed answer p_i for query q_i without moving the corresponding
array entry, cause us to leave W_m. The components of W_m consist of instances
that can reach one another via a continuous deformation without leaving W_m.



Fig. 1. A distribution of array entries (empty circles) and queries (vertical bars). The 
colors indicate the array containing an entry, i.e., all entries of color i belong to A_i.
(a) We can swap array entries that are not strict predecessors.
(b) We can swap predecessor entries with non-predecessor entries.
(c) We cannot move an array entry through a query.


Fig. 2. Legal and illegal changes to the order of queries (vertical bars), array entries
(empty circles), and strict predecessors (color of circle centers). We can swap entries
and queries as shown in (a) and (b) without leaving W_m. As depicted in (c), moving an
array entry through a query point q (or vice versa) forces us out of W_m, as the strict
predecessor of q would have to discontinuously jump to a new position.


Consider the order of the array entries, query points, and answers of an
instance x ∈ W_m, as illustrated in Figure 1. We estimate #W_m by counting
orders in separate components of W_m. Figure 2 summarizes which changes lead to
the same component and which changes leave the component. Most importantly, 
we cannot move a query value through an array entry or vice versa without 
causing the corresponding answer to discontinuously jump to a new position. 

To count the components of W_m, we consider the different ways to distribute
n distinct values x_1, x_2, ..., x_n into sorted arrays A_1, A_2, ..., A_k and then we
trap these distributions in as many separate components of W_m as possible by
placing query values. There are n!/(n_1! n_2! ··· n_k!) ways to distribute n distinct
values into k sorted arrays of sizes n_1, n_2, ..., n_k.
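
The number of such distributions is the multinomial coefficient
n!/(n_1! n_2! ··· n_k!): once we decide which values go to which array, the
order inside each sorted array is forced. The following Python sketch checks
this count by brute force on a tiny instance; the function names
count_distributions and multinomial are illustrative choices of ours.

```python
from itertools import product
from math import factorial

def count_distributions(sizes):
    """Count assignments of n = sum(sizes) distinct values to k arrays such
    that array i receives exactly sizes[i] values (brute force over k**n)."""
    n, k = sum(sizes), len(sizes)
    return sum(
        1
        for owner in product(range(k), repeat=n)  # owner[j]: array of j-th smallest value
        if all(owner.count(i) == sizes[i] for i in range(k))
    )

def multinomial(sizes):
    """n! / (n_1! n_2! ... n_k!) for n = sum(sizes)."""
    result = factorial(sum(sizes))
    for s in sizes:
        result //= factorial(s)  # exact: every intermediate quotient is an integer
    return result

print(count_distributions([2, 2, 1]), multinomial([2, 2, 1]))  # 30 30
```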



Fig. 3. Trapping distributions of the values (circles) to the arrays (colors) using only
queries (vertical bars). We place one query point every log n array entries. Using
the legal swaps from Figure 2, we can reorder the entries between two queries.


As illustrated in Figure 3, we place one query point every log n array entries.
Since we can swap any two of the log n entries between two queries, the number
of components shrinks by a factor of at most (log n)! per query compared to
placing one query between every two array entries, i.e., with m = n/log n queries,
#W_m is at least n!/(n_1! n_2! ··· n_k!) divided by ((log n)!)^m, which yields
log #W_m = Ω(n log(k/log n)) for k = ω(log n),
where the asymptotic bound follows from Stirling’s formula and from the fact
that the expression is maximized when the lengths of the arrays are balanced.
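
A sketch of the calculation behind this bound reads as follows. It assumes
balanced lengths n_i = n/k, m = n/log n queries, and logarithms to base 2;
the O(n) slack terms are our own bookkeeping.

```latex
% Assumptions: n_i = n/k for all i, m = n / \log n, logarithms to base 2.
\begin{align*}
  \#W_m &\ge \frac{n!}{n_1!\, n_2! \cdots n_k!} \cdot \frac{1}{\bigl((\log n)!\bigr)^{m}}, \\
  \log \frac{n!}{\bigl((n/k)!\bigr)^{k}} &= n \log k - O(n)
      && \text{by Stirling's formula,} \\
  m \cdot \log\bigl((\log n)!\bigr) &\le \frac{n}{\log n} \cdot \log n \cdot \log\log n
      = n \log\log n, \\
  \log \#W_m &\ge n \log k - n \log\log n - O(n)
      = n \log\frac{k}{\log n} - O(n).
\end{align*}
```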

Theorem 1. Consider k sorted arrays A_1, A_2, ..., A_k containing n entries in
total, and let A be the sorted array that results from merging A_1, A_2, ..., A_k.
When k = ω(log n), every data structure that supports strict predecessor queries
in A with a query time of O(log n) requires Ω(n log(k/log n)) construction time.
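
The pieces of Section 4 combine into the theorem roughly as follows. This is a
sketch under the assumptions Q(n) = log n, m = n/log n, and k = ω(log n); the
O(n + m) term accounts for turning m answered queries into a verification by
comparing each answer with p_i and checking that the arrays are sorted.

```latex
% Assumptions: Q(n) = \log n, m = n / \log n, and k = \omega(\log n).
\begin{align*}
  d &= n + 2m = O(n), \\
  T(n, m) + O(n + m) &= \Omega\bigl(\log \#W_m - d\bigr)
      = \Omega\Bigl(n \log\frac{k}{\log n}\Bigr), \\
  C(n) &\ge T(n, m) - m \cdot Q(n)
      = \Omega\Bigl(n \log\frac{k}{\log n}\Bigr) - O(n)
      = \Omega\Bigl(n \log\frac{k}{\log n}\Bigr).
\end{align*}
```

The last step uses that log(k/log n) = ω(1) when k = ω(log n), so the O(n)
terms are of lower order.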


5 Future Work 

In future research, we shall attempt to reestablish our lower bound for non-strict 
predecessor queries, e.g., by augmenting the algebraic computation tree model 
with support for symbolic perturbation [4]. Moreover, we shall apply our approach 
to derive new lower bounds for the construction of other data structures. 